From interactive to semantic image segmentation

Author

Varun Gulshan

Abstract

This thesis investigates two well-defined problems in image segmentation, viz. interactive and semantic image segmentation. Interactive segmentation involves power assisting a user in cutting out objects from an image, whereas semantic segmentation involves partitioning pixels in an image into object categories. We investigate various models and energy formulations for both these problems in this thesis. In order to improve the performance of interactive systems, low level texture features are introduced as a replacement for the more commonly used RGB features. To quantify the improvement obtained by using these texture features, two annotated datasets of images are introduced (one consisting of natural images, and the other consisting of camouflaged objects). A significant improvement in performance is observed when using texture features for the case of monochrome images and images containing camouflaged objects. We also explore adding mid-level cues such as shape constraints into interactive segmentation by introducing the idea of geodesic star convexity, which extends the existing notion of a star convexity prior in two important ways: (i) it allows for multiple star centres as opposed to single stars in the original prior and (ii) it generalises the shape constraint by allowing for geodesic paths as opposed to Euclidean rays. Global minima of our energy function can be obtained subject to these new constraints. We also introduce Geodesic Forests, which exploit the structure of shortest paths in implementing the extended constraints. These extensions to star convexity allow us to use such constraints in a practical segmentation system. This system is evaluated by means of a robot user to measure the amount of interaction required in a precise way, and it is shown that having shape constraints reduces user effort significantly compared to existing interactive systems.
We also introduce a new and harder dataset which augments the existing GrabCut dataset with more realistic images and ground truth taken from the PASCAL VOC segmentation challenge. In the latter part of the thesis, we bring in object category level information in order to make the interactive segmentation tasks easier, and move towards fully automated semantic segmentation. An algorithm to automatically segment humans from cluttered images given their bounding boxes is presented. A top-down segmentation of the human is obtained using classifiers trained to predict segmentation masks from local HOG descriptors. These masks are then combined with bottom-up image information in a local GrabCut-like procedure. This algorithm is later completely automated to segment humans without requiring a bounding box, and is quantitatively compared with other semantic segmentation methods. We also introduce a novel way to acquire large quantities of segmented training data relatively effortlessly using the Kinect. In the final part of this work, we explore various semantic segmentation methods based on learning using bottom-up superpixelisations. Different methods of combining multiple superpixelisations are discussed and quantitatively evaluated on two segmentation datasets. We observe that simple combinations of independently trained classifiers on single superpixelisations perform almost as well as complex methods based on jointly learning across multiple superpixelisations. We also explore CRF-based formulations for semantic segmentation, and introduce a novel visual-words-based object boundary description in the energy formulation. The object appearance and boundary parameters are trained jointly using structured output learning methods, and the benefit of adding pairwise terms is quantified on two different datasets. This thesis is submitted to the Department of Engineering Science, University of Oxford, in fulfilment of the requirements for the degree of Doctor of Philosophy.
This thesis is entirely my own work, and except where otherwise stated, describes my own research. Varun Gulshan, Brasenose College. Copyright © 2012 Varun Gulshan. All rights and lefts reserved.

Similar resources

A Hybrid Algorithm based on Deep Learning and Restricted Boltzmann Machine for Car Semantic Segmentation from Unmanned Aerial Vehicles (UAVs)-based Thermal Infrared Images

Ground vehicle monitoring (GVM) is one application area of intelligent traffic control systems that use image processing methods. In this context, unmanned aerial vehicle-based thermal infrared (UAV-TIR) images are among the best options for GVM because of their suitable spatial resolution, cost-effectiveness, and low image volume. The methods that have been prop...

Semiautomatic Image Retrieval Using the High Level Semantic Labels

Content-based image retrieval and text-based image retrieval are two fundamental approaches in the field of image retrieval. The challenges of each approach have led researchers to combine them and to use semi-automatic retrieval with user interaction in the retrieval cycle. Hence, in this paper, an image retrieval system is introduced that provides two kinds of qu...

EEML 2012 – Experimental Economics in Machine Learning

Criteria Formation of Effective High-School Graduates Employment Based upon Data Mining Methods (p. 94). Yuliya Akhmayzyanova and Irina Bolodurina. Image Processing Using Dynamical NK-Networks, Consisting of Binary Logical Elements (p. 95). D...

Online Random Forest for Interactive Image Segmentation

Many real-world applications require accurate segmentation of images into semantically meaningful regions. In many cases one needs to obtain accurate segment maps for a large dataset of images that depict objects of certain semantic categories. As current state-of-the-art methods for semantic image segmentation do not yet achieve the accuracy required for their use in real-world applications, t...

Interactive multiclass segmentation using superpixel classification

This paper addresses the problem of interactive multiclass segmentation. We propose a fast and efficient new interactive segmentation method called Superpixel Classification-based Interactive Segmentation (SCIS). From a few strokes drawn by a human user over an image, this method extracts relevant semantic objects. To get a fast calculation and an accurate segmentation, SCIS uses superpixel over...

Semantic image understanding : from pixel to word

The aim of semantic image understanding is to reveal the semantic meaning behind the image pixel. We categorize semantic image understanding into two broad categories: pixel-level and image-level semantic image understanding. While pixel-level image understanding aims to obtain the semantic meaning of each pixel, image-level understanding aims to obtain the semantic meaning of the whole image, ...

Publication date: 2011
